Perceptual harmonic cepstral coefficients as the front-end for speech recognition

نویسندگان

Liang Gu

Kenneth Rose

چکیده

Perceptual harmonic cepstral coefficients (PHCC) are proposed as features to extract for speech recognition. Pitch estimation and classification into voiced, unvoiced, and transitional speech are performed by a spectro-temporal auto-correlation technique. A peak picking algorithm is then employed to precisely locate pitch harmonics. A weighting function, which depends on the classification and the pitch harmonics, is applied to the power spectrum and ensures accurate representation of the voiced speech spectral envelope. The harmonics weighted power spectrum undergoes mel-scaled band-pass filtering, and the logenergy of the filters’ output is discrete cosine transformed to produce cepstral coefficients. For perceptual considerations, within-filter cubic-root amplitude compression is applied to reduce amplitude variation without compromise of the gain invariance properties. Experiments show substantial recognition gains of PHCC over MFCC, with 48% and 15% error rate reduction for the Mandarin digit database and E-set, respectively.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Split-band perceptual harmonic cepstral coefficients as acoustic features for speech recognition

This paper presents a significant modification of our previously proposed speech recognizer’s front-end based on perceptual harmonic cepstral coefficients. The spectrum is split into two frequency bands, which correspond to the harmonic and non-harmonic components. A weighting function, which depends both on the voiced/unvoiced/ transitional classification and on the prominence of harmonic stru...

متن کامل

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

متن کامل

Regularized minimum variance distortionless response-based cepstral features for robust continuous speech recognition

In this paper, we present robust feature extractors that incorporate a regularized minimum variance distortionless response (RMVDR) spectrum estimator instead of the discrete Fourier transform-based direct spectrum estimator, used in many front-ends including the conventional MFCC, to estimate the speech power spectrum. Direct spectrum estimators, e.g., single tapered periodogram, have high var...

متن کامل

Perceptual harmonic cepstral coefficients for speech recognition in noisy environment

Perceptual harmonic cepstral coefficients (PHCC) are proposed as features to extract from speech for recognition in noisy environments. A weighting function, which depends on the prominence of the harmonic structure, is applied to the power spectrum to ensure accurate representation of the voiced speech spectral envelope. The harmonics weighted power spectrum undergoes mel-scaled band-pass filt...

متن کامل

On compensating the Mel-frequency cepstral coefficients for noisy speech recognition

This paper describes a novel noise-robust automatic speech recognition (ASR) front-end that employs a combination of Mel-filterbank output compensation and cumulative distribution mapping of cepstral coefficients with truncated Gaussian distribution. Recognition experiments on the Aurora II connected digits database reveal that the proposed front-end achieves an average digit recognition accura...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2000

Perceptual harmonic cepstral coefficients as the front-end for speech recognition

نویسندگان

چکیده

منابع مشابه

Split-band perceptual harmonic cepstral coefficients as acoustic features for speech recognition

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Regularized minimum variance distortionless response-based cepstral features for robust continuous speech recognition

Perceptual harmonic cepstral coefficients for speech recognition in noisy environment

On compensating the Mel-frequency cepstral coefficients for noisy speech recognition

عنوان ژورنال:

اشتراک گذاری